Causal Email Threading

Authors

  • Nir Ailon
  • Zohar S. Karnin
  • Edo Liberty

Abstract

Viewing emails as parts of a sequence or thread is a convenient way to quickly understand their context. Threading is therefore one of the most widely implemented and used features in mail clients, both web-based and otherwise. However, current threading techniques are essentially limited to personal conversations, that is, human-generated emails. In this paper we present what is, to our knowledge, the first system for threading machine-generated emails, which account for more than 60% of mail traffic. We present a three-stage process. First, we identify email templates: almost identical messages sent to many users by the same vendor or service provider. Next, we build a template causality graph. Given this graph, one can assess, for any two emails, the likelihood that one caused or triggered the other. Finally, using the causality graph, we find the most likely thread structure for each user’s inbound mail stream. We present thorough experimental results obtained by analyzing 2.5 million user inboxes.
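
The abstract does not spell out how each stage is implemented, so the following Python sketch is only an illustration of the three-stage idea under simple assumptions, not the authors' method. The digit-masking template heuristic, the co-occurrence estimate of causality, the greedy parent selection, the 72-hour window, the 0.5 threshold, and all names (`Email`, `template_of`, `causality_graph`, `thread_inbox`) are hypothetical choices made for this example.

```python
import re
from collections import Counter, defaultdict
from typing import NamedTuple


class Email(NamedTuple):
    user: str
    sender: str
    subject: str
    time: float  # hours since an arbitrary origin


def template_of(email):
    """Stage 1 (assumed heuristic): mask digit runs so that near-identical
    subjects from the same sender map to the same template key."""
    return (email.sender, re.sub(r"\d+", "#", email.subject.lower()))


def causality_graph(emails, window=72.0):
    """Stage 2 (assumed estimator): weight of edge a -> b is the fraction of
    occurrences of template a that are followed by template b in the same
    inbox within `window` hours."""
    by_user = defaultdict(list)
    for e in emails:
        by_user[e.user].append(e)
    occurrences, follows = Counter(), Counter()
    for inbox in by_user.values():
        inbox.sort(key=lambda e: e.time)
        for i, a in enumerate(inbox):
            ta = template_of(a)
            occurrences[ta] += 1
            seen = set()  # count each follower template once per occurrence
            for b in inbox[i + 1:]:
                if b.time - a.time > window:
                    break
                tb = template_of(b)
                if tb != ta and tb not in seen:
                    seen.add(tb)
                    follows[(ta, tb)] += 1
    return {pair: n / occurrences[pair[0]] for pair, n in follows.items()}


def thread_inbox(inbox, graph, threshold=0.5, window=72.0):
    """Stage 3 (greedy simplification): attach each email to the earlier
    email whose template most likely caused it, if any edge weight exceeds
    `threshold`. Returns the time-sorted inbox and a parent index per email
    (None marks the start of a thread)."""
    ordered = sorted(inbox, key=lambda e: e.time)
    parents = []
    for i, child in enumerate(ordered):
        best, best_score = None, threshold
        for j in range(i):
            cand = ordered[j]
            if child.time - cand.time > window:
                continue
            score = graph.get((template_of(cand), template_of(child)), 0.0)
            if score > best_score:
                best, best_score = j, score
        parents.append(best)
    return ordered, parents


# Made-up toy data: three shoppers receive a confirmation and then a shipping
# notice; a newsletter reaches three users independently of any order.
emails = [
    Email("alice", "shop.example", "Order 1001 confirmed", 0.0),
    Email("alice", "shop.example", "Order 1001 shipped", 24.0),
    Email("bob", "shop.example", "Order 2002 confirmed", 5.0),
    Email("bob", "news.example", "Weekly digest 7", 10.0),
    Email("bob", "shop.example", "Order 2002 shipped", 30.0),
    Email("carol", "shop.example", "Order 3003 confirmed", 1.0),
    Email("carol", "shop.example", "Order 3003 shipped", 20.0),
    Email("dave", "news.example", "Weekly digest 7", 10.0),
    Email("erin", "news.example", "Weekly digest 7", 10.0),
]
graph = causality_graph(emails)
ordered, parents = thread_inbox([e for e in emails if e.user == "bob"], graph)
```

In this toy data the confirmation-to-shipping edge is observed for every shopper, while the newsletter follows a confirmation in only one inbox, so the shipping notice in bob's inbox is attached to the order confirmation and the newsletter starts its own thread. A greedy per-email choice is used here purely for brevity; the abstract speaks of finding the most likely thread structure overall, which this sketch does not attempt.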

Similar resources

A Mobile User Interface for Threading, Marking, and Previewing Email

Email has become a vital center for managing work, and has a growing role on mobile devices. However, the challenges of creating user interfaces to support the complex ways people manage their email are exacerbated by the limitations of mobile devices, notably in screen size and input mechanisms. We address this challenge by enhancing a PDA’s inbox with user interface tools for viewing and navig...

Threading Electronic Mail - A Preliminary Study

Tools for processing e-mail and other electronic messages should be able to recognize and manipulate threads, that is, conversations among two or more people carried out by exchange of messages. While user clients typically insert in messages structural information useful for recovering threads, inconsistencies between clients, loose standards, creative user behavior, and the subjective nature ...

A Publicly Available Annotated Corpus for Supervised Email Summarization

Annotated email corpora are necessary for evaluation and training of machine learning summarization techniques. The scarcity of corpora has been a limiting factor for research in this field. We describe our process of creating a new annotated email thread corpus that will be made publicly available. We present the trade-offs of the different annotation methods that could be used.

Email Thread Reassembly Using Similarity Matching

Email thread reassembly is the task of linking messages by parent-child relationships. In this paper, we present two approaches to address this problem. One exploits previously undocumented header information from the Microsoft Exchange Protocol. The other uses string similarity metrics and a heuristic algorithm to reassemble threads in the absence of header information. The pros and cons of bot...

Studies of Automated Collection of Email Records Technical

There are few quantitative techniques for directly measuring email use patterns. This paper describes an automated tool that, with a user’s permission, reads their mail database to create a one-time snapshot and gathers relevant structural and behavioral information. We successfully collected important statistics about message threading, folders, and mail volume. Our techniques are relevant to ...

Publication date: 2012